fix(datetime): tighten gen and whitelist #7

Merged: 2010YOUY01 merged 1 commit into datafusion-contrib:main from kumarUjjawal:fix/datetime-generator-whitelist-tests on Apr 19, 2026

@kumarUjjawal
Contributor

Summary

This PR fixes two datetime-related fuzzer issues.

First, it tightens to_unixtime generation so format arguments are only generated for string input, which matches the function contract.

Second, it expands the whitelist for expected datetime parse failures from generated format strings.

Why

The fuzzer was still producing avoidable NoCrash noise from datetime functions.

Examples:

  • generating to_unixtime(timestamp, format) even though extra format arguments only work with string input
  • treating expected parse failures from random to_date / to_timestamp / to_unixtime format strings as non-whitelisted errors

These are generator or whitelist issues, not real engine crashes.
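The first example can be illustrated with a minimal sketch of the tightened rule. The names here (`ArgType`, `to_unixtime_arg_types`) are hypothetical stand-ins and are not the fuzzer's actual types or generator code; the only thing taken from the PR is the rule itself, that format arguments are emitted only when the first argument is a string.

```rust
/// Hypothetical, simplified stand-in for the fuzzer's argument types.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArgType {
    String,
    Int64,
    Timestamp,
}

/// Builds the argument-type list for one generated `to_unixtime` call.
/// Format arguments (always strings) are appended only when the first
/// argument is itself a string; otherwise the call stays single-argument.
fn to_unixtime_arg_types(first: ArgType, requested_formats: usize) -> Vec<ArgType> {
    let mut args = vec![first];
    if first == ArgType::String {
        args.extend(std::iter::repeat(ArgType::String).take(requested_formats));
    }
    args
}
```

Under this rule a request such as `to_unixtime_arg_types(ArgType::Timestamp, 2)` silently drops the format arguments, so a shape like `to_unixtime(timestamp, format)` is never produced.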

What changed

  • fix to_unixtime argument generation
  • whitelist expected timestamp/date parse errors for generated datetime format cases
  • add tests for the new whitelist rules
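As a rough illustration of the whitelist side, the sketch below shows one way such a rule could be expressed. It is not the PR's `is_error_whitelisted` implementation: the function name, its `(query, error)` signature, and the matched error substrings are all assumptions made for this example.

```rust
/// Hypothetical helper: returns true when `error` looks like an expected
/// parse failure for a query that calls one of the datetime functions
/// named in this PR. Matching is case-insensitive substring matching.
fn looks_like_expected_datetime_parse_error(query: &str, error: &str) -> bool {
    // Function names taken from the PR description.
    const DATETIME_FUNCS: [&str; 3] = ["to_date(", "to_timestamp(", "to_unixtime("];
    // Assumed error fragments; the real engine messages may differ.
    const PARSE_ERROR_MARKERS: [&str; 2] = ["error parsing timestamp", "error parsing date"];

    let query = query.to_ascii_lowercase();
    let error = error.to_ascii_lowercase();

    let calls_datetime_func = DATETIME_FUNCS.iter().any(|f| query.contains(f));
    let is_parse_error = PARSE_ERROR_MARKERS.iter().any(|m| error.contains(m));

    // Both conditions must hold, so parse errors from unrelated queries
    // and unrelated errors from datetime queries are still reported.
    calls_datetime_func && is_parse_error
}
```

Requiring both the function name and the error fragment keeps the rule narrow: an internal error raised by a `to_date` query would still surface as a finding.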

Collaborator

@2010YOUY01 2010YOUY01 left a comment

LGTM, thank you!

use super::is_error_whitelisted;

#[test]
fn whitelists_timestamp_parse_errors_for_to_timestamp_queries() {
Collaborator

Follow-up idea: I think an alternative testing strategy is to run this query on DataFusion and assert that the expected error is returned.
Once DF is updated with different error messages, the unit tests will catch the change and we can update the whitelist directly, which is easier to investigate than fuzzer oracle inconsistencies.

TypeGroup::OneOf(vec![FuzzerDataType::String.to_datafusion_type()]),
]],
])],
vec![
Collaborator

It's a good idea to generate only valid signatures at this level.

If we want to inject more randomness to generate invalid exprs, we can do that at the expr-generation layer.

@2010YOUY01 2010YOUY01 merged commit 063c4b6 into datafusion-contrib:main Apr 19, 2026
1 check passed